EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH (CERN) 




CERN-PH-EP/2012-226 
2012/08/20 



CMS-EWK-11-017 



Study of the dijet mass spectrum in $pp \to W$ + jets events

at $\sqrt{s} = 7$ TeV



The CMS Collaboration



Abstract 



A study is presented of the invariant mass spectrum of the two jets with highest transverse
momentum in $pp \to W$+2-jet and $W$+3-jet events. The data sample corresponds
to an integrated luminosity of 5.0 fb$^{-1}$ collected with the CMS detector at $\sqrt{s} = 7$ TeV.
No evidence is found for the anomalous structure reported by the CDF Collaboration,
and an upper limit of 5.0 pb is established at 95% confidence level on the production
cross section for a generic Gaussian signal with mass near 150 GeV. Two theoretical
models that predict a dijet resonance near 150 GeV are excluded.

The CDF Collaboration reported evidence for an excess near 150 GeV in the invariant mass
($m_{jj}$) spectrum of the two leading transverse-momentum ($p_T$) jets produced in
$p\bar{p} \to W$+2-jet events [1]. The D0 Collaboration carried out a similar analysis but did not
confirm the result [2]. This letter details the search for a similar excess in the $m_{jj}$ spectrum
using 5.0 fb$^{-1}$ of data collected from pp collisions at $\sqrt{s} = 7$ TeV with the Compact Muon
Solenoid (CMS) detector at the CERN Large Hadron Collider (LHC) during 2010 and 2011.

Events are selected with one well-identified and isolated lepton (muon or electron), large
missing transverse energy $\not\!E_T$, and exactly two or exactly three high-$p_T$ jets. The selection
criteria are similar to those used at the Tevatron [1, 2], but modified to adapt to the higher
background rates and different experimental conditions at the LHC. We also place more stringent
requirements on the jet kinematics, as suggested in Ref. [3], to enhance any signal compared to the
irreducible W plus jets background. We investigate three representative models: a technicolor
$\pi_T$ from the decay of a technicolor $\rho_T$ [4], a leptophobic $Z'$ decaying to two jets [5], and the
standard model (SM) Higgs boson produced in association with a W boson (referred to as WH
production) and decaying to a pair of jets. The WH production cross section at the LHC is
negligible compared to contributions from other SM processes, which overwhelm any contribution
to this analysis from $WH \to \ell\nu jj$ decays for $m_H \approx 125$ GeV [6, 7].

The CMS coordinate system has its origin at the center of the detector, with the z axis pointing 
along the direction of the counterclockwise proton beam. The azimuthal angle is denoted as
$\phi$, the polar angle as $\theta$, and the pseudorapidity is defined as $\eta = -\ln[\tan(\theta/2)]$.
The central feature of the CMS detector is a superconducting solenoid, of 6 m internal diameter, that
produces an axial magnetic field of 3.8 T. Located within the field volume is the silicon pixel
and strip tracker extending up to $|\eta| = 2.5$, as well as a lead tungstate crystal electromagnetic
calorimeter (ECAL) and a brass/scintillator hadronic calorimeter (HCAL), both extending up
to $|\eta| = 3$. Outside the field volume in the forward region ($3 < |\eta| < 5$) is an iron/quartz-fiber
hadronic calorimeter. Muons are measured in gas-ionization detectors embedded in the steel
return yoke outside the solenoid, in the pseudorapidity range $|\eta| < 2.4$. A detailed description
of the CMS experiment can be found in Ref. [8].
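
As an illustration of the pseudorapidity definition above, the following minimal Python sketch converts between the polar angle and the pseudorapidity. The function names are hypothetical and are not taken from any CMS software.

```python
import math

def pseudorapidity(theta):
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))

def polar_angle(eta):
    """Invert the definition: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))

# A particle emitted perpendicular to the beam (theta = 90 degrees) has eta close to 0.
eta_perp = pseudorapidity(math.pi / 2.0)

# Polar angle, in degrees, at the edge of the tracker coverage quoted in the text (|eta| = 2.5).
theta_tracker_edge = math.degrees(polar_angle(2.5))
```

Under this definition the tracker edge at $|\eta| = 2.5$ corresponds to a polar angle of roughly 9.4 degrees from the beam axis.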

The data were collected with a suite of single-lepton triggers, mostly with a $p_T$ threshold of
24 GeV for muons and 25-32 GeV for electrons. The trigger efficiency for the selected muons
(electrons) is about 94% (90%). Jets and $\not\!E_T$ [9-11] are reconstructed with the particle-flow
algorithm, which combines information from several subdetectors. Jets are formed with
the anti-$k_T$ clustering algorithm [12] with a distance parameter of 0.5. We require jets to have
$|\eta| < 2.4$, which ensures that they lie within the tracker acceptance, and a minimum $p_T$ of 30 GeV.
Jets are required to satisfy identification criteria that eliminate jet candidates originating from
noisy channels in the hadron calorimeter [13]. Jet energy corrections [14] are applied to account for
the jet energy response as a function of $\eta$ and $p_T$, and to correct for additional proton-proton
interactions occurring within the same bunch crossing [15, 16]. Charged-particle tracks not
originating at the primary vertex are not considered for jet clustering. The jet $p_T$ resolution
varies from 15% at $p_T = 40$ GeV to 6% at $p_T = 400$ GeV [14]. The mass resolution $\sigma_{jj}$ for a jet
pair is 10% of $m_{jj}$ for masses around 150 GeV.

Muon candidates are reconstructed in the region $|\eta| < 2.1$ by combining information from the
silicon tracker and the muon detectors by means of a global fit. Electron candidates are identified
within $|\eta| < 1.44$ and $1.57 < |\eta| < 2.5$ as clustered energy deposits in the electromagnetic
calorimeter that are matched to tracks. Muon and electron candidates need to fulfill quality
criteria established for the measurement of the inclusive W and Z cross sections [10]. In addition,
all leptons must be well-separated from hadronic activity in the event. Jets within an $\eta$-$\phi$ cone
of radius 0.3 around a lepton candidate are removed.

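The removal of jets close to a lepton can be sketched as follows. This is a minimal Python illustration, assuming that the cone radius is the usual separation $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$; the dictionary layout, the function names, and the example event are hypothetical.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def remove_jets_near_leptons(jets, leptons, radius=0.3):
    """Keep only jets at least `radius` away in eta-phi from every lepton."""
    return [
        jet for jet in jets
        if all(delta_r(jet["eta"], jet["phi"], lep["eta"], lep["phi"]) >= radius
               for lep in leptons)
    ]

# Hypothetical event: the first jet overlaps with the lepton and is dropped.
jets = [
    {"pt": 52.0, "eta": 0.50, "phi": 1.00},
    {"pt": 41.0, "eta": -1.20, "phi": -2.40},
]
leptons = [{"pt": 30.0, "eta": 0.55, "phi": 1.05}]
cleaned = remove_jets_near_leptons(jets, leptons)
```

Wrapping the azimuthal difference matters for objects on either side of $\phi = \pm\pi$, which would otherwise appear far apart.
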
Leptons from candidate $W \to \ell\nu$ decays must satisfy a single-lepton trigger and identification
and isolation requirements. The muon (electron) transverse momentum must exceed
25 (35) GeV, and $\not\!E_T$ must be greater than 25 (30) GeV in the muon (electron) analysis. The
transverse mass $M_T$ of each W candidate must be greater than 50 GeV, where

$$M_T = \sqrt{2\, p_T^{\ell}\, \not\!E_T \left[1 - \cos\left(\phi_{\ell} - \phi_{\not E_T}\right)\right]},$$

and $\phi_{\ell}$ and $\phi_{\not E_T}$ are the azimuthal angles of the lepton and $\not\!E_T$, respectively. Events with more
than one identified lepton are vetoed.
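
The W candidate requirements can be sketched in Python as below. The sketch assumes the standard transverse mass definition, $M_T^2 = 2 p_T^{\ell} \not\!E_T [1 - \cos\Delta\phi]$, and uses the thresholds quoted in the text; the function and variable names are hypothetical.

```python
import math

def transverse_mass(lepton_pt, lepton_phi, met, met_phi):
    """Standard transverse mass of the lepton plus missing-ET system (GeV)."""
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(lepton_phi - met_phi)))

# Thresholds quoted in the text, in GeV: (lepton pT, missing ET, transverse mass).
THRESHOLDS = {"muon": (25.0, 25.0, 50.0), "electron": (35.0, 30.0, 50.0)}

def passes_w_selection(channel, lepton_pt, lepton_phi, met, met_phi):
    """Apply the kinematic W-candidate requirements for the given channel."""
    min_pt, min_met, min_mt = THRESHOLDS[channel]
    mt = transverse_mass(lepton_pt, lepton_phi, met, met_phi)
    return lepton_pt > min_pt and met > min_met and mt > min_mt

# A lepton back-to-back with the missing ET gives M_T = 2 * sqrt(pT * MET).
mt_back_to_back = transverse_mass(40.0, 0.0, 40.0, math.pi)
```

The trigger, identification, isolation, and second-lepton veto are not modeled here.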

We retain events with exactly two or exactly three jets satisfying $p_T > 30$ GeV. The leading
jet is required to have $p_T > 40$ GeV and point more than 0.4 rad in azimuth from the direction
of the $\not\!E_T$. We further require $|\vec{p}_T^{\,j_1} + \vec{p}_T^{\,j_2}| > 45$ GeV and $|\Delta\eta(j_1, j_2)| < 1.2$, where the jets are
numbered in order of decreasing $p_T$. The selected jets and the lepton from the W decay are
required to originate from the same primary vertex. The requirement $0.3 < p_T^{j_2}/m_{jj} < 0.7$ is
imposed to take advantage of the separation between resonant dijet and nonresonant W plus
jets production observed in simulation studies.
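
A minimal Python sketch of the jet kinematic requirements follows. It is illustrative only: jets are represented as hypothetical `(pt, eta, phi)` tuples that are assumed to have already passed the $|\eta| < 2.4$ and identification criteria, the second angular requirement is taken to be on the pseudorapidity difference of the two leading jets, and the dijet mass is computed in the massless-jet approximation, which is an assumption of this sketch rather than a statement about the analysis.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def dijet_mass_massless(jet1, jet2):
    """Invariant mass of two jets, neglecting the individual jet masses."""
    pt1, eta1, phi1 = jet1
    pt2, eta2, phi2 = jet2
    return math.sqrt(2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2)))

def passes_dijet_selection(jets, met_phi):
    """Apply the jet multiplicity and kinematic requirements quoted in the text."""
    selected = sorted((j for j in jets if j[0] > 30.0), key=lambda j: j[0], reverse=True)
    if len(selected) not in (2, 3):
        return False
    j1, j2 = selected[0], selected[1]
    # Leading jet: pT > 40 GeV and more than 0.4 rad in azimuth from the missing ET.
    if j1[0] <= 40.0 or abs(delta_phi(j1[2], met_phi)) <= 0.4:
        return False
    # Vector sum of the two leading jet transverse momenta must exceed 45 GeV.
    sum_px = j1[0] * math.cos(j1[2]) + j2[0] * math.cos(j2[2])
    sum_py = j1[0] * math.sin(j1[2]) + j2[0] * math.sin(j2[2])
    if math.hypot(sum_px, sum_py) <= 45.0 or abs(j1[1] - j2[1]) >= 1.2:
        return False
    # Ratio of the second-jet pT to the dijet mass must lie between 0.3 and 0.7.
    mjj = dijet_mass_massless(j1, j2)
    return mjj > 0.0 and 0.3 < j2[0] / mjj < 0.7

# Hypothetical two-jet event that satisfies all requirements.
example_jets = [(80.0, 0.3, 0.0), (60.0, -0.2, 2.0)]
example_pass = passes_dijet_selection(example_jets, met_phi=-2.5)
```

The common primary-vertex requirement is not represented in this sketch.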

The selected sample is dominated by events containing a W with two or more jets. Smaller con- 
tributions come from top-pair and single-top decays, Drell-Yan events with two or more jets, 
multijet production, and WW and WZ diboson production where one W decays into leptons 
and the other W or Z decays into quarks. 

The shapes of the $m_{jj}$ distributions for background processes are modeled using samples of
simulated events. The MadGraph5 1.3.30 [17] event generator produces parton-level events
with a W boson and up to four partons on the basis of matrix-element (ME) calculations. The
ME-parton shower matching scale $\mu$ is taken to be 20 GeV [18], and the factorization and
renormalization scales are set to $q^2 = M_W^2 + p_{T,W}^2$. Four alternative samples of W events are
generated with the scales increased and reduced by a factor of two with respect to those of the
reference sample. Samples of $t\bar{t}$ and Drell-Yan events are also generated with MadGraph.
Single-top production is modeled with POWHEG 1.0 [19]. Multijet and diboson samples (WW,
WZ, ZZ) are generated with PYTHIA 6.422 [20]. PYTHIA provides the parton shower simulation
in all cases, with parameters of the underlying event set to the Z2 tune [21]. The set of parton
distribution functions used is CTEQ6LL [22]. Simulated signal samples for the technicolor
and WH models are generated with PYTHIA, while the leptophobic $Z'$ is generated with
MadGraph. A GEANT4-based simulation [23] of the CMS detector is used in the production of all
Monte Carlo (MC) samples. Multiple proton-proton interactions within a bunch crossing are 
taken into account, and the triggers are emulated. All simulated events are reconstructed and 
analyzed with the same software as data. 

We determine the contributions of the known SM processes to the observed $m_{jj}$ spectrum by
means of an extended unbinned maximum-likelihood fit in the range between 40 GeV and
400 GeV. The fit is performed separately in four event categories, $\{\mu, e\} \times \{\text{2-jet}, \text{3-jet}\}$, because
the background compositions differ. The $m_{jj}$ signal region, 123 to 186 GeV, corresponding to
$\pm 2\sigma_{jj}$, is excluded from this fit in order to arrive at an unbiased estimate of a possible resonant
enhancement in this region.

Table 1: Treatment of background shapes and normalizations in a fit to the data. The background
normalizations are constrained within the fit to Gaussian distributions with the listed
central values and widths.

Process               Shape     Constraint on normalization
W plus jets           MC/data   Unconstrained
Diboson (WW+WZ)       MC        61.2 pb ± 10% (NLO) [24]
$t\bar{t}$            MC        163 pb ± 7% (NLO) [25]
Single-top            MC        84.9 pb ± 5% (NNLL) [26-28]
Drell-Yan plus jets   MC        3.05 nb ± 4.3% (NNLO) [29]
Multijet (QCD)        data      $\not\!E_T$ fit (described in text)

Table 1 lists the SM processes included in the fit. The W plus jets normalization is
a free fit parameter because it is by far the dominant background. The normalizations of the
other background components are allowed to vary within Gaussian constraints around their
central values. The central values for all processes except multijet are obtained from
next-to-leading-order (NLO) or next-to-NLO (NNLO) calculations, and the constraints reflect the
theoretical uncertainties. The $m_{jj}$ distribution shapes are obtained from simulation. Multijet events
contribute when jets are misidentified as isolated leptons. The central value of the multijet
normalization is obtained from a separate fit to the $\not\!E_T$ distribution [10], and the constraint is
determined by the corresponding fit uncertainty. The shape of the $m_{jj}$ distribution for multijet
events is derived from data events with lepton candidates that fail the isolation requirements.
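
The structure of such a fit can be sketched as follows. This is a minimal Python illustration of an extended unbinned negative log-likelihood with Gaussian constraints on selected yields, not the fitting code used in the analysis. The function names, the flat toy shape, and the toy data are hypothetical, and expressing the constraints directly on event yields is an assumption of the sketch.

```python
import math

FIT_RANGE = (40.0, 400.0)   # GeV, fit range quoted in the text
EXCLUDED = (123.0, 186.0)   # GeV, signal region left out of the fit

def in_fit_region(mjj):
    """True if mjj lies in the fit range but outside the excluded signal region."""
    return FIT_RANGE[0] <= mjj <= FIT_RANGE[1] and not (EXCLUDED[0] < mjj < EXCLUDED[1])

def extended_nll(masses, yields, pdfs, constraints):
    """Extended unbinned negative log-likelihood, up to a constant.

    masses:      observed m_jj values inside the fit region
    yields:      process name -> expected number of events in the fit region
    pdfs:        process name -> density normalized to 1 over the fit region
    constraints: process name -> (central value, width) of a Gaussian constraint;
                 processes that are absent are left unconstrained
    """
    nll = sum(yields.values())
    for mjj in masses:
        nll -= math.log(sum(yields[name] * pdfs[name](mjj) for name in yields))
    for name, (central, sigma) in constraints.items():
        nll += 0.5 * ((yields[name] - central) / sigma) ** 2
    return nll

# Toy check with a single, flat, unconstrained component over the two sidebands.
SIDEBAND_WIDTH = (EXCLUDED[0] - FIT_RANGE[0]) + (FIT_RANGE[1] - EXCLUDED[1])

def flat_pdf(mjj):
    return 1.0 / SIDEBAND_WIDTH

toy_masses = [m for m in (50.0, 90.0, 150.0, 200.0, 300.0) if in_fit_region(m)]
nll_at_4 = extended_nll(toy_masses, {"wjets": 4.0}, {"wjets": flat_pdf}, {})
```

In the toy check the event at 150 GeV falls in the excluded region and is dropped, and the likelihood is minimized when the single yield equals the number of remaining events. A real fit would minimize this function over all yields and shape parameters with a numerical optimizer.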

The $m_{jj}$ spectrum of the dominant W plus jets component is not well described by the default
CMS MadGraph sample. No significant improvement is observed with the alternative W
plus jets samples. We employ a combination of three shapes to describe this component in the
fitting function:

$$F_{W+\mathrm{jets}} = \alpha\, \mathcal{F}_{W+\mathrm{jets}}(\mu_0^2, q'^2) + \beta\, \mathcal{F}_{W+\mathrm{jets}}(\mu'^2, q_0^2) + (1 - \alpha - \beta)\, \mathcal{F}_{W+\mathrm{jets}}(\mu_0^2, q_0^2),$$

where $\mathcal{F}_{W+\mathrm{jets}}$ denotes the $m_{jj}$ shape from simulation. The parameters $\mu_0$ ($\mu'$) and $q_0$ ($q'$)
correspond to the default (alternative) values of $\mu$ and $q$, respectively, while the fractional
contributions $\alpha$ and $\beta$ are free to vary between 0 and 1. We take $\mu' = 2\mu_0$ or $0.5\mu_0$
($q' = 2q_0$ or $0.5q_0$), depending on which alternative sample provides a better fit to data.
Furthermore, we verify via pseudo-experiment simulations that the function in the above equation
has sufficient freedom to describe the W plus jets shape in the signal region.
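
The three-shape combination can be illustrated with binned templates. The Python sketch below assumes that each simulated shape is available as a list of bin fractions normalized to unit area; the function name, argument names, and template values are hypothetical.

```python
def combine_wjets_shapes(alpha, beta, shape_alt_q, shape_alt_mu, shape_default):
    """Mix three normalized templates with fractions alpha, beta, and 1 - alpha - beta.

    shape_alt_q:   template with default matching scale and alternative q
    shape_alt_mu:  template with alternative matching scale and default q
    shape_default: template with default matching scale and default q
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("alpha and beta must lie between 0 and 1")
    if not (len(shape_alt_q) == len(shape_alt_mu) == len(shape_default)):
        raise ValueError("templates must share the same binning")
    gamma = 1.0 - alpha - beta
    return [
        alpha * a + beta * b + gamma * c
        for a, b, c in zip(shape_alt_q, shape_alt_mu, shape_default)
    ]

# Hypothetical three-bin templates, each normalized to unit area.
mixed = combine_wjets_shapes(
    0.2, 0.3,
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [0.6, 0.2, 0.2],
)
```

Because the three coefficients sum to one, the mixture of unit-area templates is itself normalized to unit area, so $\alpha$ and $\beta$ change only the shape and leave the W plus jets yield as a separate fit parameter.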

Figure 1(a) shows the observed $m_{jj}$ distribution for all four event categories combined, together
with the fitted projections of the contributions of various SM processes. Figure 1(b) shows
the same distribution after subtraction of all SM contributions from data except electroweak
diboson WW/WZ events. No peak is visible in the spectrum except that near 80 GeV due to
diboson events. Figure 1(c) shows the normalized residuals. Table 2 presents the yields of
various SM components obtained from the fit. The sum of all the contributions is compared to
the number of observed events. All numbers except those in the last two rows are for the $m_{jj}$
range of 40 to 400 GeV. The last two rows compare the observed and predicted contributions
in the $m_{jj}$ range of 123 to 186 GeV. The data agree with the SM expectations, and we find no
significant excess in the signal region. We observe a sizable deficit in the muon 2-jet data with
respect to the prediction from our model. We do not observe similar deviations in the other
three categories, which suggests a statistical fluctuation rather than a systematic bias.
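
The size of the deviations in the signal region can be gauged from the last two rows of Table 2. The Python sketch below computes a simple pull per category, assuming that the Poisson variance of the data can be approximated by the predicted count and added in quadrature to the quoted fit uncertainty. This estimator is an illustrative choice and is not taken from the analysis.

```python
import math

# Signal-region entries of Table 2: (predicted, uncertainty on prediction, observed).
SIGNAL_REGION = {
    "muon 2-jet": (14511.0, 125.0, 14050),
    "muon 3-jet": (7739.0, 95.0, 7751),
    "electron 2-jet": (7944.0, 92.0, 8023),
    "electron 3-jet": (4347.0, 70.0, 4438),
}

def pull(predicted, uncertainty, observed):
    """(observed - predicted) divided by the combined statistical and fit uncertainty."""
    return (observed - predicted) / math.sqrt(predicted + uncertainty ** 2)

pulls = {name: pull(*values) for name, values in SIGNAL_REGION.items()}
```

With this estimator the muon 2-jet category shows a deficit of roughly 2.7 standard deviations, while the other three categories deviate by less than one standard deviation.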

Figure 1: (a) Distribution of the invariant mass spectrum of the leading two jets observed in
data. Overlaid are the fit projections of the various components. The region between the vertical
dashed lines is excluded from the fit. Depicted is the number of events per GeV. (b) The
same distribution after subtraction of all SM components except the electroweak processes
WW/WZ. Error bars correspond to the statistical uncertainties. The hatched band represents
the systematic uncertainty on the sum of the SM components. (c) The normalized residual,
(data - fit)/(fit uncertainty).

Table 2: Event yields determined from maximum-likelihood fits to the data. The total fit yields
are corrected for bias. The total fit uncertainties include the corrections derived from the fit
validation described in the text and the effect of correlations among the individual contributions.

                                    muons                        electrons
Process                   2-jet          3-jet          2-jet           3-jet
W plus jets               58919 ± 530    13069 ± 366    29787 ± 1153    8397 ± 292
Dibosons                  1236 ± 114     333 ± 32       685 ± 65        184 ± 18
$t\bar{t}$                4570 ± 307     9049 ± 382     2556 ± 174      4265 ± 253
Single-top                1765 ± 87      1001 ± 50      916 ± 46        521 ± 26
Drell-Yan plus jets       1837 ± 79      561 ± 24       1061 ± 46       364 ± 16
Multijet (QCD)            29 ± 284       0 ± 90         3944 ± 1133     324 ± 160
Fit $\chi^2$ probability  0.454          0.729          0.969           0.991
Total from fit            68294 ± 307    24013 ± 193    38949 ± 228     14055 ± 143
Data                      67900          24046          38973           14145

In the signal region $123 < m_{jj} < 186$ GeV (excluded from the fit)
Total predicted           14511 ± 125    7739 ± 95      7944 ± 92       4347 ± 70
Data                      14050          7751           8023            4438

We validate the fit procedure by performing pseudo-experiments. In each experiment, we
generate the $m_{jj}$ pseudo-data of the SM processes, taking into account the correlation among
the yields, and then fit each pseudo-data sample. The results indicate that the bias on the total
yield is below 0.2% and that the fit underestimates the total yield uncertainty by about 30%.
These effects are corrected for in the final result. Uncertainties in the jet energy are estimated
using a sample of W bosons decaying hadronically in a pure sample of semileptonic $t\bar{t}$ events.
The mean and resolution of the reconstructed dijet mass distribution in data agree within 0.6%
with the expectation from simulation. A small difference in $\not\!E_T$ resolution [9] between data and
simulation affects the signal acceptance for the new physics models under consideration at the
0.5% level. Further systematic uncertainties are due to the uncertainty of the trigger efficiency
estimates (1%) and the estimate of lepton reconstruction and selection efficiency (2%) [10]. The
uncertainty on the integrated luminosity is 2.2% [30].

We scrutinize the dijet mass spectrum near 150 GeV, searching for a technicolor, leptophobic
$Z'$, or WH resonant enhancement. We also use a generic signal model obtained by convolving
a delta function centered at $m_{jj} = 150$ GeV with a Gaussian function having width equal to $\sigma_{jj}$.
The expected number of signal events at the LHC for a given cross section at the Tevatron can
be estimated by considering the ratio of the predicted cross sections for our reference process,
WH production with $M_H = 150$ GeV. This process is dominated by quark-antiquark ($q\bar{q}$)
annihilation. As $q\bar{q}$ processes have the smallest increase in parton luminosity from the Tevatron to
the LHC, this choice provides a conservative limit. We therefore assume

$$\sigma_{\mathrm{LHC}}^{\text{dijet resonance}} = \sigma_{\mathrm{Tevatron}}^{\text{dijet resonance}} \times \frac{\sigma_{\mathrm{LHC}}^{WH}}{\sigma_{\mathrm{Tevatron}}^{WH}},$$

where $\sigma_{\mathrm{LHC}}^{WH} = 300.1$ fb [31] and $\sigma_{\mathrm{Tevatron}}^{WH} = 71.8$ fb [32]. A generic Gaussian signal normalized
to $\sigma_{\mathrm{Tevatron}} = 4$ pb corresponds to $\sigma_{\mathrm{LHC}} = 16.7$ pb. The values of $\sigma_{\mathrm{LHC}} \times B(X \to jj)$ and $\epsilon A$ for
the models considered are given in Table 3.
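
The cross-section scaling can be checked numerically with a few lines of Python. The constant and function names are hypothetical; the input values are those quoted in the text.

```python
SIGMA_WH_LHC_FB = 300.1       # WH cross section at the LHC (fb), from the text
SIGMA_WH_TEVATRON_FB = 71.8   # WH cross section at the Tevatron (fb), from the text

def scale_tevatron_to_lhc(sigma_tevatron_pb):
    """Scale a Tevatron cross section to the LHC using the WH cross-section ratio."""
    return sigma_tevatron_pb * SIGMA_WH_LHC_FB / SIGMA_WH_TEVATRON_FB

# A 4 pb signal at the Tevatron corresponds to about 16.7 pb at the LHC.
sigma_lhc_pb = scale_tevatron_to_lhc(4.0)
```

The ratio of the two WH cross sections is about 4.2, which is the factor applied to the Tevatron normalization.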

Figure 2: (a) The observed and expected values of the $\mathrm{CL}_s$ statistic for a generic Gaussian signal
hypothesis with $M = 150$ GeV and $\sigma = 15$ GeV, as a function of the dijet signal cross section.
(b) Observed and expected 95% CL upper limits, with one- and two-sigma error bands, on the
cross section divided by the expected values for various signal models. The limits are calculated
using the $\mathrm{CL}_s$ method. A value of the excluded cross section over the predicted cross section
of less than one indicates that the model is excluded at 95% CL. Table 3 lists the cross sections
for these models.

Table 3: The PYTHIA cross sections at 7 TeV times branching fraction to jets ($\sigma \times B$) and
overall efficiency times acceptance ($\epsilon A$) for various signal models. The relative uncertainties in
$\epsilon$ measurements are 1-2%.

                                           $\epsilon A$
                                      muons            electrons
Signal model      $\sigma \times B$ (pb)   2-jet    3-jet    2-jet    3-jet
Technicolor [4]   7.4                      0.065    0.020    0.039    0.011
$Z'$ [5]          8.1                      0.070    0.023    0.042    0.014
WH [20]           0.059                    0.060    0.019    0.038    0.013

Since we observe no resonant enhancement, we proceed to set exclusion limits using a modified
frequentist $\mathrm{CL}_s$ method [33, 34] with profile likelihood as the test statistic. Inputs to the
limit-setting procedure are the $m_{jj}$ distribution obtained by combining the SM components from the
fit, the observed distribution in data, and the expectation from the dijet resonance model under
consideration. Figure 2(a) shows the observed and expected $\mathrm{CL}_s$ values versus cross section
for a generic Gaussian signal, after combining the results of all four event categories. We set
a 95% confidence level (CL) upper limit of 5.0 pb and a 99.9% CL upper limit of 8.5 pb on the
dijet production cross section for a generic resonance with WH-like $\epsilon A$.
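
The logic of the $\mathrm{CL}_s$ construction can be illustrated with a deliberately simplified single-bin counting experiment. The Python sketch below is not the procedure used in the analysis, which relies on a profile-likelihood test statistic over the full $m_{jj}$ distribution and includes systematic uncertainties. The function names and the toy inputs are hypothetical.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, background, signal):
    """CLs = CL(s+b) / CL(b) for a counting experiment without systematic uncertainties."""
    return poisson_cdf(n_obs, signal + background) / poisson_cdf(n_obs, background)

def upper_limit(n_obs, background, cl=0.95, step=0.01):
    """Smallest signal yield on a grid of spacing `step` with CLs <= 1 - cl."""
    signal = 0.0
    while cls(n_obs, background, signal) > 1.0 - cl:
        signal += step
    return signal

# With zero observed events, CLs reduces to exp(-s), so the 95% CL limit is about 3 events.
limit_zero_observed = upper_limit(0, 0.0)
```

Dividing by the background-only probability prevents a downward fluctuation of the background from producing an exclusion of signals to which the measurement has no sensitivity. A limit on the signal yield obtained this way would be converted to a cross section by dividing by the integrated luminosity and the efficiency times acceptance.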

Figure 2(b) compares the 95% CL upper limits with the expected cross sections for technicolor,
leptophobic $Z'$, and WH ($M_H = 150$ GeV) signals. The technicolor and $Z'$ models are excluded.
Because we have minimal sensitivity to WH, we compare the limit in Fig. 2(b) to 100 times the
SM cross section as an illustration.

In summary, we have studied the invariant mass spectrum of the two jets with highest transverse
momentum in $pp \to W$+2-jet and $W$+3-jet events, with the W decaying leptonically to
a muon or electron. The analyzed data sample corresponds to an integrated luminosity of
5.0 fb$^{-1}$ at $\sqrt{s} = 7$ TeV. We find no evidence for a resonant enhancement near a dijet mass of
150 GeV, as reported by the CDF Collaboration, and set upper limits on the dijet production
cross section of 5.0 pb at 95% CL and 8.5 pb at 99.9% CL. Two theoretical models, leptophobic
$Z'$ and technicolor, which predict the presence of a resonant enhancement near 150 GeV, are
excluded.